98 research outputs found

    Price indicators for Airbnb accommodations

    New forms of hospitality have grown increasingly popular and successful over the last decades. Nowadays, they are chosen for different reasons, one of the most important certainly being price. Understanding the elements that affect price determination is crucial to increasing profitability. We propose two price indicators for Airbnb accommodations, which are defined in three phases using the proportional odds model as a reference model. The first phase focuses on estimating the probability that an accommodation belongs to a specific price class. The second phase evaluates the predictive ability of the model by computing three different indexes. Finally, the three indexes are combined to define the indicators q and r, which evaluate, respectively, the impact that six different dimensions (transports, culture, crowd, property, management, and time) have on the price determination of Airbnb accommodations and their relative importance with respect to neighborhoods. The analysis is focused on 61 neighborhoods of Rome. The findings show differences in the impact of the dimensions on price for each neighborhood of Rome.
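As a rough illustration of the first phase, the sketch below computes price-class probabilities under a proportional odds model, where P(Y <= j | x) = logistic(theta_j - x·beta). The covariates, coefficients, and thresholds are hypothetical values chosen for the example; they are not estimates from the paper.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def class_probabilities(x, beta, thresholds):
    """Return P(Y = j | x) for each ordered class j.

    Assumes the cumulative form P(Y <= j | x) = logistic(theta_j - x.beta)
    with increasing thresholds theta_1 < ... < theta_{J-1}.
    """
    eta = sum(xi * bi for xi, bi in zip(x, beta))
    # Cumulative probabilities for classes 1..J-1, then 1.0 for the last class.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of consecutive cumulative probabilities
        previous = c
    return probs

# Hypothetical accommodation with two covariates and four price classes.
probs = class_probabilities(x=[0.5, 1.0], beta=[0.8, -0.3], thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

Because the same beta is shared across all thresholds, a covariate shifts every cumulative probability in the same direction, which is the "proportional odds" assumption.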

    Evolutionary Algorithms in Decision Tree Induction

    One of the biggest problems that many data analysis techniques have to deal with nowadays is Combinatorial Optimization, which, in the past, led many methods to be set aside. Nowadays, the greater (though still insufficient!) computing power available makes it possible to apply such techniques within certain bounds. Since other research fields like Artificial Intelligence have been (and still are) dealing with such problems, their contribution to statistics has been very significant. This chapter tries to cast Combinatorial Optimization methods into the Artificial Intelligence framework, particularly with respect to Decision Tree Induction, which is considered a powerful instrument for knowledge extraction and decision-making support. When the exhaustive enumeration and evaluation of all the possible candidate solutions to a Tree-based Induction problem is not computationally affordable, the use of Nature Inspired Optimization Algorithms, which have been proven to be powerful instruments for attacking many combinatorial optimization problems, can be of great help. In this respect, the attention is focused on three main problems involving Decision Tree Induction, with particular reference to the Classification and Regression Tree (CART) algorithm (Breiman et al., 1984). First, the problem of splitting complex predictors, such as multi-attribute ones, is addressed through the use of Genetic Algorithms. In addition, the possibility of growing “optimal” exploratory trees is investigated by making use of the Ant Colony Optimization (ACO) algorithm. Finally, the derivation of a subset of decision trees for modelling a multi-attribute response on the basis of a data-driven heuristic is also described. The proposed approaches might be useful for knowledge extraction from large databases as well as for data mining applications. The solutions they offer for complicated data modelling and data analysis problems might be considered for a possible implementation in a Decision Support System (DSS). The remainder of the chapter is organized as follows. Section 2 describes the main features and the recent developments of Decision Tree Induction. An overview of Combinatorial Optimization, with a particular focus on Genetic Algorithms and Ant Colony Optimization, is presented in section 3. The use of these two algorithms within the Decision Tree Induction framework is described in section 4, together with the description of the algorithm for modelling a multi-attribute response. Section 5 summarizes the results of the proposed method on real and simulated datasets. Concluding remarks are presented in section 6. The chapter also includes an appendix that presents J-Fast, a Java-based software for Decision Trees that currently implements Genetic Algorithms and Ant Colony Optimization.
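To make the first problem concrete, the following is a minimal, generic sketch of a genetic algorithm that searches for a binary split of a categorical predictor by maximizing the decrease in Gini impurity. It is not the chapter's J-Fast implementation: the contingency table, population size, selection scheme, and mutation rate are all assumptions made for illustration.

```python
import random

def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)

def impurity_decrease(mask, table):
    """Gini decrease of the split where mask[k] == 1 sends category k to the left node.

    table[k] holds the class counts observed for category k.
    """
    n_classes = len(table[0])
    left, right = [0] * n_classes, [0] * n_classes
    for goes_left, counts in zip(mask, table):
        target = left if goes_left else right
        for j, c in enumerate(counts):
            target[j] += c
    n_left, n_right = sum(left), sum(right)
    if n_left == 0 or n_right == 0:
        return 0.0  # degenerate split: every category on one side
    n = n_left + n_right
    parent = [a + b for a, b in zip(left, right)]
    return gini(parent) - (n_left / n) * gini(left) - (n_right / n) * gini(right)

def genetic_split(table, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve bit masks over categories; return the best mask found and its gain."""
    rng = random.Random(seed)
    k = len(table)
    fitness = lambda m: impurity_decrease(m, table)
    population = [[rng.randint(0, 1) for _ in range(k)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Truncation selection: the fitter half of the population become parents.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, k - 1)              # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best, fitness(best)

# Hypothetical predictor with six categories and a two-class response.
table = [[30, 5], [4, 25], [20, 10], [3, 18], [15, 15], [8, 22]]
mask, gain = genetic_split(table)
print(mask, round(gain, 4))
```

With k categories there are 2^(k-1) - 1 distinct binary splits, so exhaustive search is cheap for six categories but quickly becomes infeasible as k grows; that is the situation in which a heuristic search of this kind is typically considered.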

    Investigating Shared Genetic Bases between Psychiatric Disorders, Cardiometabolic and Sleep Traits Using K-Means Clustering and Local Genetic Correlation Analysis

    Psychiatric disorders are among the leading causes of the global health-related burden. Comorbidity with cardiometabolic and sleep disorders contributes substantially to this burden. While both genetic and environmental factors have been suggested to underlie these comorbidities, the specific molecular underpinnings are not well understood. In this study, we leveraged large datasets from genome-wide association studies (GWAS) on psychiatric disorders, cardiometabolic and sleep-related traits. We computed genetic correlations between pairs of traits using cross-trait linkage disequilibrium (LD) score regression and identified clusters of genetically correlated traits using k-means clustering. We further investigated the identified associations using two-sample Mendelian randomization (MR) and tested the local genetic correlation at the identified loci. In the 7-cluster optimal solution, we identified a cluster including insomnia and the psychiatric disorders major depressive disorder (MDD), post-traumatic stress disorder (PTSD), and attention-deficit/hyperactivity disorder (ADHD). MR analysis supported the existence of a bidirectional association between MDD and insomnia, and the genetic variants driving this association were found to affect gene expression in different brain regions. Some of the identified loci were further supported by results of local genetic correlation analysis, with body mass index (BMI) and C-reactive protein (CRP) levels suggested to explain part of the observed effects. We discuss how the investigation of the genetic relationships between psychiatric disorders and comorbid conditions might help us to improve our understanding of their pathogenesis and develop improved treatment strategies.

    Chapter Decomposing tourists’ sentiment from raw NL text to assess customer satisfaction

    The importance of Word of Mouth is growing day by day in many areas. This phenomenon is evident in everyday life, e.g., in the rise of influencers and social media managers. If more people positively debate specific products, then even more people are encouraged to buy them, and vice versa. This effect is directly affected by the relationship between the potential customer and the reviewer. Moreover, given the negative reporting bias, Word of Mouth analysis is clearly of great interest in many fields. We propose an algorithm to extract the sentiment from a natural language text corpus. Combining Neural Networks, which have high predictive power but are harder to interpret, with simpler but informative models allows us to quantify a sentiment with a numeric value and to predict whether a sentence has a positive or negative sentiment. The assessment of an objective quantity improves the interpretation of the results in many fields. For example, it is possible to identify crucial specific sectors that require intervention, improving the company's services whilst finding the strengths of the company itself (useful for advertising campaigns). Moreover, since time information is usually available in textual data of web origin, it is possible to analyze trends on macro/micro topics. After showing how to properly reduce the dimensionality of the textual data with a data-cleaning phase, we show how to combine Word Embedding, K-Means clustering, SentiWordNet, and the Threshold-based Naïve Bayes classifier. We apply this method to Booking.com and TripAdvisor.com data, analyzing the sentiment of people who discuss a particular issue and providing an example of customer satisfaction.

    Fractal analysis of Dow Jones industrial index returns

    The Dow Jones Industrial Average 30 (DJIA30) Index was analyzed to show that models based on the Fractal Market Hypothesis (FMH) are preferable to those based on the Efficient Market Hypothesis (EMH). In a first step, Rescaled Range Analysis was applied to search for long-term dependence between index returns. The Hurst coefficient was computed as a measure of persistence in the trend of the observed time series. A Monte Carlo simulation based on both Geometric Brownian Motion (GBM) and Fractional Brownian Motion (FBM) models was used in the second step to investigate the forecasting ability of each model in a situation where information about future prices is lacking. In the third step, the volatility of the index returns obtained from the simulated GBM and FBM was considered together with that produced by a GARCH(1,1) model in order to determine the approach that minimizes the Value at Risk (VaR) and the Conditional Value at Risk (CVaR) of a one-asset portfolio where the DJIA30 index underlies an Exchange Traded Commodity (ETC). In this case, observed returns could follow either a Gaussian distribution or a Pareto distribution with a scale parameter equal to the inverse of the Hurst coefficient determined in the first step.
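The first step can be sketched as follows: a minimal rescaled range (R/S) estimate of the Hurst coefficient, taken as the slope of log(R/S) against log(window size). The window sizes and the simulated Gaussian returns are assumptions for the example; the paper's data and exact estimation procedure may differ.

```python
import math
import random

def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over the standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, total = [], 0.0
    for value in chunk:
        total += value - mean
        cumulative.append(total)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None

def hurst_exponent(returns, window_sizes):
    """Estimate H as the least-squares slope of log(mean R/S) on log(window size)."""
    log_sizes, log_rs = [], []
    for size in window_sizes:
        values = []
        # Split the series into non-overlapping windows of the given size.
        for start in range(0, len(returns) - size + 1, size):
            rs = rescaled_range(returns[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
    x_bar = sum(log_sizes) / len(log_sizes)
    y_bar = sum(log_rs) / len(log_rs)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_sizes, log_rs))
    denominator = sum((x - x_bar) ** 2 for x in log_sizes)
    return numerator / denominator

# Simulated independent Gaussian returns (hypothetical data, not DJIA30 returns).
rng = random.Random(42)
returns = [rng.gauss(0.0, 0.01) for _ in range(2048)]
h = hurst_exponent(returns, window_sizes=[16, 32, 64, 128, 256])
print(round(h, 3))
```

For independent returns the estimate should lie near 0.5, although plain R/S is known to be biased upward in small windows; values clearly above 0.5 are usually read as persistence and values below 0.5 as anti-persistence.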

    The Bradley–Terry Regression Trunk approach for Modeling Preference Data with Small Trees

    This paper introduces the Bradley-Terry regression trunk model, a novel probabilistic approach for the analysis of preference data expressed through paired comparison rankings. In some cases, it may be reasonable to assume that the preferences expressed by individuals depend on their characteristics. Within the framework of tree-based partitioning, we specify a tree-based model estimating the joint effects of subject-specific covariates over and above their main effects. We therefore combine a tree-based model and the log-linear Bradley-Terry model, using the outcome of the comparisons as the response variable. The proposed model provides a way to discover interaction effects when no a priori hypotheses are available. It produces a small tree, called a trunk, that represents a fair compromise between a simple interpretation of the interaction effects and an easy-to-read partition of judges based on their characteristics and the preferences they have expressed. We present an application on a real dataset following two different approaches, and a simulation study to test the model's performance. Simulations showed that the quality of the model performance increases when the number of rankings and objects increases. In addition, the performance improves considerably when the judges' characteristics have a high impact on their choices.
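For readers unfamiliar with the underlying model, the sketch below fits a plain Bradley-Terry model (without covariates or tree partitioning) using the classical minorization-maximization update, under which object i is preferred to object j with probability worth_i / (worth_i + worth_j). The win counts are hypothetical, and this is not the regression trunk procedure proposed in the paper.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry worth parameters from paired comparison counts.

    wins[i][j] is the number of times object i was preferred over object j.
    Uses the classical iterative (MM) update and normalizes the worths to sum to 1.
    """
    m = len(wins)
    worth = [1.0 / m] * m
    for _ in range(iterations):
        updated = []
        for i in range(m):
            total_wins = sum(wins[i][j] for j in range(m) if j != i)
            denominator = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(m) if j != i
            )
            updated.append(total_wins / denominator)
        norm = sum(updated)
        worth = [w / norm for w in updated]
    return worth

def preference_probability(worth, i, j):
    """Probability that object i is preferred to object j under the fitted model."""
    return worth[i] / (worth[i] + worth[j])

# Hypothetical counts: three objects, each pair compared ten times.
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
worth = fit_bradley_terry(wins)
print([round(w, 3) for w in worth])
print(round(preference_probability(worth, 0, 1), 3))
```

In the log-linear formulation the same model is written with log-worths as additive effects, which is what makes it natural to let those effects depend on subject-specific covariates.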

    Gene silencing by RNAi in mouse Sertoli cells

    Background: RNA interference (RNAi) is a valuable tool in the investigation of gene function. The purpose of this study was to examine the availability, target cell types and efficiency of RNAi in the mouse seminiferous epithelium. Methods: The experimental model was based on transgenic mice expressing EGFP (enhanced green fluorescent protein). RNAi was induced by in vivo transfection of plasmid vectors encoding for short hairpin RNAs (shRNAs) targeting EGFP. shRNAs were transfected in vivo by microinjection into the seminiferous tubules via the rete testis followed by square wave electroporation. As a transfection reporter, expression of red fluorescent protein (HcRed 1) was used. Cell types, the efficiency of both transfections and RNAi were all evaluated. Results: Sertoli cells were the main transfected cells. A reduction of about 40% in the level of EGFP protein was detected in cells successfully transfected both in vivo and in vitro. However, the efficiency of in vivo transfection was low. Conclusion: In adult seminiferous epithelial cells, in vivo post-transcriptional gene silencing mediated by RNAi via shRNA is efficient in Sertoli cells. Similar levels of RNAi were detected both in vivo and in vitro. This also indicates that Sertoli cells have the necessary silencing machinery to repress the expression of endogenous genes via RNAi.
